Frequently Asked Questions
 
Questions

How can I get my site included in the Wayback Machine?

How can I remove my site's pages from the Wayback Machine?

What is the Internet Archive Wayback Machine?

Can I link to old pages on the Wayback Machine?

Why isn't the site I'm looking for in the archive?

What does it mean when a site's archive data has been "updated"?

Who was involved in the creation of the Internet Archive Wayback Machine?

How was the Wayback Machine made?

How large is the Wayback Machine?

What type of machinery is used in this Internet Archive?

How do you archive dynamic pages?

Why are some sites harder to archive than others?

Some sites are not available because of robots.txt or other exclusions. What does that mean?

How can I help the Internet Archive and the Wayback Machine?

Can I search the Archive?

Why am I getting broken or gray images on a site?

How do I contact the Internet Archive?

What is the Wayback Machine's Copyright Policy?

Why is the Internet Archive collecting sites from the Internet? What makes the information useful?

Do you archive email? Chat?

Do you collect all the sites on the Web?

Is there any personal information in these collections?

Who has access to the collections? What about the public?

How can I get a copy of the pages on my Web site? If my site got hacked or damaged, could I get a backup from the Archive?

Can people download sites from the Wayback?

How do you protect my privacy if you archive my site?

What do 'failed connection' and other error messages mean?

Why are there no recent archives in the Wayback Machine?

How does the Wayback Machine behave with JavaScript turned off?

How did I end up on the live version of a site? or I clicked on X date, but now I am on Y date, how is that possible?

Where does the name come from?

How do I cite Wayback Machine urls in MLA format?

How can I get pages authenticated from the Wayback Machine? How can I use the pages in court?

The Wayback Machine

How can I get my site included in the Wayback Machine?

Alexa Internet has been crawling the web since 1996, which has resulted in a massive archive. If you have a web site that you would like saved for posterity in the Internet Archive, and a search of the Wayback Machine finds no results for it, there are three ways to submit it.

Method 1: visit Alexa's "Webmasters" page at http://pages.alexa.com/help/webmasters/index.html#crawl_site.

Method 2: if you have the Alexa toolbar installed, just visit the site.

Method 3: while visiting the site, use 'show related links' in Internet Explorer, which uses the Alexa service.

In all cases, ensure that your site's 'robots.txt' rules and in-page META robots directives do not tell crawlers to avoid your site.

Sites are usually crawled within 8 weeks of submission, sometimes much sooner. However, there is at least a 6 month lag between the date a site is crawled and the date it appears in the Wayback Machine.

How can I remove my site's pages from the Wayback Machine?

The Internet Archive is not interested in preserving or offering access to Web sites or other Internet documents of persons who do not want their materials in the collection. By placing a simple robots.txt file on your Web server, you can exclude your site from being crawled as well as exclude any historical pages from the Wayback Machine.

Internet Archive uses the exclusion policy intended for use by both academic and non-academic digital repositories and archivists. See our exclusion policy.

Here are directions on how to automatically exclude your site. If you cannot place the robots.txt file, opt not to, or have further questions, email us at info at archive dot org.

What is the Internet Archive Wayback Machine?

The Internet Archive Wayback Machine is a service that allows people to visit archived versions of Web sites. Visitors to the Wayback Machine can type in a URL, select a date range, and then begin surfing on an archived version of the Web. Imagine surfing circa 1999 and looking at all the Y2K hype, or revisiting an older version of your favorite Web site. The Internet Archive Wayback Machine can make all of this possible.

Can I link to old pages on the Wayback Machine?

Yes! The Wayback Machine is built so that it can be used and referenced. If you find an archived page that you would like to reference on your Web page or in an article, you can copy the URL. You can even use fuzzy URL matching and date specification... but that's a bit more advanced.

Why isn't the site I'm looking for in the archive?

Some sites may not be included because the automated crawlers were unaware of their existence at the time of the crawl. It's also possible that some sites were not archived because they were password protected, blocked by robots.txt, or otherwise inaccessible to our automated systems. Site owners might also have requested that their sites be excluded from the Wayback Machine. When this has occurred, you will see a "blocked site error" message. When a site is excluded because of robots.txt, you will see a "robots.txt query exclusion error" message.

What does it mean when a site's archive data has been "updated"?

When our automated systems crawl the web every few months or so, we find that only about 50% of all pages on the web have changed from our previous visit. This means that much of the content in our archive is duplicate material. If you don't see "*" next to an archived document, then the content on the archived page is identical to the previously archived copy.

Who was involved in the creation of the Internet Archive Wayback Machine?

The original idea for the Internet Archive Wayback Machine dates to 1996, when the Internet Archive first began archiving the web. Now, five years later, with over 100 terabytes and a dozen web crawls completed, the Internet Archive has made the Internet Archive Wayback Machine available to the public. The Internet Archive has relied on donations of web crawls, technology, and expertise from Alexa Internet and others. The Internet Archive Wayback Machine is owned and operated by the Internet Archive.

How was the Wayback Machine made?

Alexa Internet, in cooperation with the Internet Archive, has designed a three dimensional index that allows browsing of web documents over multiple time periods, and turned this unique feature into the Wayback Machine.

How large is the Wayback Machine?

The Internet Archive Wayback Machine contains almost 2 petabytes of data and is currently growing at a rate of 20 terabytes per month. This eclipses the amount of text contained in the world's largest libraries, including the Library of Congress.

What type of machinery is used in this Internet Archive?

Much of the Internet Archive is stored on hundreds of slightly modified x86 servers. The computers run the Linux operating system. Each computer has 512 MB of memory and can hold just over 1 terabyte of data on ATA disks. However, we are developing a new way of storing our data on a smaller machine. Each machine will store 1 terabyte. For more information, go to www.petabox.org.

How do you archive dynamic pages?

There are many different kinds of dynamic pages, some of which are easily stored in an archive and some of which fall apart completely. When a dynamic page renders standard html, the archive works beautifully. When a dynamic page contains forms, JavaScript, or other elements that require interaction with the originating host, the archive will not contain the original site's functionality.

Why are some sites harder to archive than others?

If you look at our collection of archived sites, you will find some broken pages, missing graphics, and some sites that aren't archived at all. Here are some things that make it difficult to archive a web site:

  • Robots.txt -- We respect robot exclusion headers.
  • JavaScript -- JavaScript elements are often hard to archive, especially if they generate links without having the full name in the page. Also, if JavaScript needs to contact the originating server in order to work, it will fail when archived.
  • Server side image maps -- Like any functionality on the web, if it needs to contact the originating server in order to work, it will fail when archived.
  • Unknown sites -- The archive contains crawls of the Web completed by Alexa Internet. If Alexa doesn't know about your site, it won't be archived. Use the Alexa Toolbar (available at www.alexa.com), and it will know about your page. Or you can visit Alexa's Archive Your Site page at http://pages.alexa.com/help/webmasters/index.html#crawl_site.
  • Orphan pages -- If there are no links to your pages, the robot won't find them (the robots don't enter queries in search boxes).
As a general rule of thumb, simple html is the easiest to archive.

Some sites are not available because of robots.txt or other exclusions. What does that mean?

The Standard for Robot Exclusion (SRE) is a means by which web site owners can instruct automated systems not to crawl their sites. Web site owners can specify files or directories that are disallowed from a crawl, and they can even create specific rules for different automated crawlers. All of this information is contained in a file called robots.txt. While robots.txt has been adopted as the universal standard for robot exclusion, compliance with robots.txt is strictly voluntary. In fact, most web sites do not have a robots.txt file, and many web crawlers are not programmed to obey the instructions anyway. However, Alexa Internet, the company that crawls the web for the Internet Archive, does respect robots.txt instructions, and even does so retroactively. If a web site owner decides he / she prefers not to have a web crawler visiting his / her files and sets up robots.txt on the site, the Alexa crawlers will stop visiting those files and will make unavailable all files previously gathered from that site. This means that sometimes, while using the Internet Archive Wayback Machine, you may find a site that is unavailable due to robots.txt (you will see a "robots.txt query exclusion error" message). Sometimes a web site owner will contact us directly and ask us to stop crawling or archiving a site, and we endeavor to comply with these requests. When you come across a "blocked site error" message, that means that a site owner has made such a request and it has been honored.
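As a rough illustration of how per-crawler rules in robots.txt are read, here is a minimal sketch using Python's standard urllib.robotparser module. The crawler name ia_archiver, the example.com URLs, and the helper name is_blocked are assumptions made for this example; they are not taken from this FAQ, and the sketch does not describe how the Alexa crawlers are actually implemented.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: one rule set for a crawler named "ia_archiver"
# (an assumed name for this sketch) and one for every other crawler.
EXAMPLE_ROBOTS = """\
User-agent: ia_archiver
Disallow: /

User-agent: *
Disallow: /private/
"""


def is_blocked(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text disallows user_agent from url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for agent in ("ia_archiver", "OtherBot"):
        for url in ("http://example.com/index.html",
                    "http://example.com/private/page.html"):
            print(agent, url, "blocked" if is_blocked(EXAMPLE_ROBOTS, agent, url) else "allowed")
```

In this sketch the first crawler is shut out of the whole site, while every other crawler is only kept out of one directory.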

How can I help the Internet Archive and the Wayback Machine?

The Internet Archive actively seeks donations of digital materials for preservation. If you have digital materials that may be of interest to future generations, please let us know by sending an email to info at archive dot org. The Internet Archive is also seeking additional funding to continue this important mission. You can click the donate tab above or click here. Thank you for considering us in your charitable giving.

Can I search the Archive?

Using the Internet Archive Wayback Machine, it is possible to search for the names of sites contained in the Archive (URLs) and to specify date ranges for your search. We hope to implement a full text search engine at some point in the future.

Why am I getting broken or gray images on a site?

Broken images (when there is a small red "x" where the image should be) occur when the images are not available on our servers. Usually this means that we did not archive them. Gray images are the result of robots.txt exclusions. The site in question may have blocked robot access to their images directory.

How do I contact the Internet Archive?

All questions about the Wayback Machine, or other Internet Archive projects, should be addressed to info at archive dot org.

What is the Wayback Machine's Copyright Policy?

The Internet Archive respects the intellectual property rights and other proprietary rights of others. The Internet Archive may, in appropriate circumstances and at its discretion, remove certain content or disable access to content that appears to infringe the copyright or other intellectual property rights of others. If you believe that your copyright has been violated by material available through the Internet Archive, please provide the Internet Archive Copyright Agent with the following information:

  • Identification of the copyrighted work that you claim has been infringed;
  • An exact description of where the material about which you complain is located within the Internet Archive collections;
  • Your address, telephone number, and email address;
  • A statement by you that you have a good-faith belief that the disputed use is not authorized by the copyright owner, its agent, or the law;
  • A statement by you, made under penalty of perjury, that the above information in your notice is accurate and that you are the owner of the copyright interest involved or are authorized to act on behalf of that owner;
  • Your electronic or physical signature.

Internet Archive uses the exclusion policy intended for use by both academic and non-academic digital repositories and archivists. See our full exclusion policy.

The Internet Archive Copyright Agent can be reached as follows:

Internet Archive Copyright Agent
Internet Archive
Presidio of San Francisco
P.O. Box 29244
San Francisco, CA 94129
Phone: 415-561-6767
Email: info at archive dot org

Why is the Internet Archive collecting sites from the Internet? What makes the information useful?

Most societies place importance on preserving artifacts of their culture and heritage. Without such artifacts, civilization has no memory and no mechanism to learn from its successes and failures. Our culture now produces more and more artifacts in digital form. The Archive's mission is to help preserve those artifacts and create an Internet library for researchers, historians, and scholars. The Archive collaborates with institutions including the Library of Congress and the Smithsonian.

Do you archive email? Chat?

No, we do not collect or archive chat systems or personal email messages that have not been posted to Usenet bulletin boards or publicly accessible online message boards.

Do you collect all the sites on the Web?

No, we collect only publicly accessible Web pages. We do not archive pages that require a password to access, pages tagged for "robot exclusion" by their owners, pages that are only accessible when a person types into and sends a form, or pages on secure servers. If a site owner properly requests removal of a Web site through http://www.archive.org/about/exclude.php, we will exclude that site from the Wayback Machine.

Is there any personal information in these collections?

We collect Web pages that are publicly accessible. These may include pages with personal information.

Who has access to the collections? What about the public?

Anyone can access our collections through our website archive.org. The web archive can be searched using the Wayback Machine.

The Archive makes the collections available at no cost to researchers, historians, and scholars. At present, it takes someone with a certain level of technical knowledge to access collections in a way other than our website, but there is no requirement that a user be affiliated with any particular organization.

How can I get a copy of the pages on my Web site? If my site got hacked or damaged, could I get a backup from the Archive?

Our terms of use do not cover backups for the general public. However, you may use the Internet Archive Wayback Machine to locate and access archived versions of your web site. We can't guarantee that your site has been or will be archived. For site owners only, we offer limited backup capabilities. Send your request to info at archive dot org for more information.

Can people download sites from the Wayback?

Our terms of use specify that users of the Wayback Machine are not to copy data from the collection. If there are special circumstances that you think the Archive should consider, please contact info at archive dot org.

How do you protect my privacy if you archive my site?

The Archive collects Web pages that are publicly available — the same ones that you might find as you surfed around the Web. We do not archive pages that require a password to access, pages tagged for "robot exclusion" by their owners, pages that are only accessible when a person types into and sends a form, or pages on secure servers. We also provide information on removing a site from the collections. Those who use the collections must agree to certain terms of use.

Like a public library, the Archive provides free and open access to its collections to researchers, historians, and scholars. Our cultural norms have long promoted access to documents that were, but no longer are, publicly accessible.

Given the rate at which the Internet is changing — the average life of a Web page is only 77 days — if no effort is made to preserve it, it will be entirely and irretrievably lost. Rather than let this moment slip by, we are proceeding with documenting the growth and content of the Internet, using libraries as our model.

If you are interested in these issues, please join and contribute to our announcement and discussion lists.

What do 'failed connection' and other error messages mean?

Below is a list of the main error messages you will see while searching the Wayback Machine. If you see an error message that does not have the Internet Archive Wayback Machine logo in the upper left corner, you are most likely looking at an archived page or the live web.

Failed Connection: The server that the particular piece of information lives on is down. Generally these clear up within two weeks.

Robots.txt Query Exclusion: A robots.txt is something that a site owner puts on their site that keeps crawlers like our own from crawling them. The Internet Archive retroactively respects all robots.txt.

Blocked Site Error: Site owners, copyright holders and others who fit Internet Archive's exclusion policy have requested that the site be excluded from the Wayback Machine. For exclusion criteria, please see our exclusion policy (we use the same one used and developed by other digital repositories and archivists both academic and non-academic).

Path Index Error: A path index error message refers to a problem in our database wherein the information requested is not available (generally because of a machine or software issue; however, each case can be different). We cannot always completely fix these errors in a timely manner.

Not in Archive: Generally this means that the site archived has a redirect on it and the site you are redirected to is not in the archive or cannot be found on the live web.

Why are there no recent archives in the Wayback Machine?

It generally takes 6 months or more for pages to appear in the Wayback Machine after they are collected, because of delays in transferring material to long-term storage and indexing.

There is no access to files before they appear in the Wayback Machine.

How does the Wayback Machine behave with JavaScript turned off?

If you have JavaScript turned off, images and links will be from the live web, not from our archive of old Web files.

How did I end up on the live version of a site? or I clicked on X date, but now I am on Y date, how is that possible?

Not every date for every site archived is 100% complete. When you are surfing an incomplete archived site, the Wayback Machine will grab the closest available date to the one you are in for the links that are missing. In the event that we do not have the link archived at all, the Wayback Machine will look for the link on the live web and grab it if available. Pay attention to the date code embedded in the archived URL. This is the string of digits in the middle; it translates as yyyymmddhhmmss. For example, in the URL http://web.archive.org/web/20000229123340/http://www.yahoo.com/ the date the site was crawled was Feb 29, 2000 at 12:33 and 40 seconds.
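To check which date you have landed on, the date code can be decoded mechanically. The following is a minimal sketch, assuming the archived URL has the shape shown in the example above (a 14-digit date code followed by the original URL); the function name parse_wayback_url is made up for this illustration.

```python
import re
from datetime import datetime
from typing import Tuple

# Assumed URL shape: http://web.archive.org/web/<yyyymmddhhmmss>/<original URL>
WAYBACK_URL = re.compile(r"^https?://web\.archive\.org/web/(\d{14})/(.+)$")


def parse_wayback_url(url: str) -> Tuple[datetime, str]:
    """Split an archived URL into its crawl time and the original URL."""
    match = WAYBACK_URL.match(url)
    if match is None:
        raise ValueError("not an archived URL with a 14-digit date code: " + url)
    date_code, original_url = match.groups()
    return datetime.strptime(date_code, "%Y%m%d%H%M%S"), original_url


if __name__ == "__main__":
    crawled, original = parse_wayback_url(
        "http://web.archive.org/web/20000229123340/http://www.yahoo.com/"
    )
    print(crawled.isoformat(), original)
```

Comparing the decoded date of the page you are viewing with the date you originally clicked shows whether the Wayback Machine substituted a nearby capture.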

Where does the name come from?

The Wayback Machine is named in reference to the famous Mr. Peabody's WABAC (pronounced way-back) machine from the Rocky and Bullwinkle cartoon show.

How do I cite Wayback Machine urls in MLA format?

This question is a newer one. We asked MLA to help us with how to cite an archived URL in correct format. They did say that there is no established format for resources like the Wayback Machine, but it's best to err on the side of more information. You should cite the webpage as you would normally, and then give the Wayback Machine information. They provided the following example: McDonald, R. C. "Basic Canary Care." _Robirda Online_. 12 Sept. 2004. 18 Dec. 2006 . _Internet Archive_. < http://web.archive.org/web/20041009202820/http://www.robirda.com/cancare.html>. They added that if the date that the information was updated is missing, one can use the closest date in the Wayback Machine. Then comes the date when the page is retrieved and the original URL. Neither URL should be underlined in the bibliography itself. Thanks MLA!

How can I get pages authenticated from the Wayback Machine? How can I use the pages in court?

The Wayback Machine tool was not designed for legal use. We do have a legal request policy found at our legal page. Please read through the entire policy before contacting us with your questions. We do have a standard affidavit as well as a FAQ section for lawyers. We would prefer that before you contact us for such services, you see if the other side will stipulate instead. We do not have an in-house legal staff, so this service takes away from our normal duties. Once you have read through our policy, if you still have questions, please contact us for more information.

Questions

How can I add a thumbnail image to my item's details page?

How can I get iTunes to create a new playlist when I stream MP3s?

How can I play OGG files on a Mac?

I'm having trouble with a 'blank'/corrupted ZIP file. What do I do?

How can I add a logo to the upper right corner of my Netlabels collection?

How can I get my tracks to show up in the right order?

What kind of audio file should I submit?

The flash player is covering my files! How do I move it?

Audio

How can I add a thumbnail image to my item's details page?

First, make sure you're logged on to archive.org with the same email address you used to upload the item.

The image you upload must be named identifier.jpg (where identifier is your item's identifier name) and you must choose file format JPEG in the metadata editor.

To upload the image:

  • Go to your item's details page
  • Click the "Edit item" link in the lower left box
  • Click the Item Manager button
  • Click the "Check out files" button
  • Upload the image file to the item's directory using FTP
  • Check the item back in
  • After a few minutes, return to your item's details page. Click "Edit item" and find the .jpg file you just uploaded in the list of files near the bottom of this page. Select the file format JPEG from the drop down menu, and click the submit button.
  • Wait 5-20 minutes for your changes to show up. If you're still not seeing your new file, please try clearing your cache and viewing the page again, since you may still be looking at an old version of the page.

How can I get iTunes to create a new playlist when I stream MP3s?

As an iTunes user, you might have noticed that iTunes loads the Archive's streaming MP3s (M3U files) into your library, and subsequently the files get shuffled and are out of order. We have come up with a solution to this problem.

Step by step instructions:

  • Download this AppleScript application.
  • Copy the m3uPlayer application to a permanent location
  • Choose some recording in the Archive to stream. This will cause an M3U to download to your default download folder (typically your desktop).
  • Click on the downloaded M3U file, hit option-I (or option-click and select Get Info). Change "open with" from iTunes to m3uPlayer (locate it wherever you saved it)
  • Click change all so that all future M3U files will open this way

That's it! If you have trouble, post a message to this forum.

Thanks to http://www.balnaves.com/archives/000092.php for the code, instructions, and inspiration

How can I play OGG files on a Mac?

On the Mac, there is a free component to ogg-ify iTunes. The freeware VLC Media Player will also play OGG files. http://www.macosxhints.com/article.php?story=20020424233612407

I'm having trouble with a 'blank'/corrupted ZIP file. What do I do?

There are a variety of problems that may be causing this. Here are a couple of the most common. If you have a Mac running OS X, the default unzip utility (Stuffit) does not deal well with those Archive ZIP files that are 'compressed on the fly'. You may see an empty directory - if so, then try downloading Zip Tools for Mac OS X and using the drag and drop software within that to unzip your download. [Make sure you save your download to your desktop before trying things on it.] If you're having any trouble with downloads timing out or being incomplete, especially on Windows, then you may be able to use download managers such as GetRight. These will restart your download if it fails. However, some 'ZIP on the fly' downloads don't play well with download managers. If you find that to be the case, the safest thing to do is to download each track individually in a download manager, or use FTP to log in.

How can I add a logo to the upper right corner of my Netlabels collection?

First, make sure you're logged on to archive.org with the same email address you used when you created your Netlabels collection. Then:

  • Go to your collection's front page
  • Click the "edit" link next to the title
  • Click Item Manager
  • Check out the item's files
  • Upload the logo to the item's directory using FTP
  • Check the item back in
  • Return to collection front page and click "edit" link again
  • Find logo file at bottom of page, choose "Collection Header" from the drop down list and click submit.
It will take a few minutes for the changes to appear.

How can I get my tracks to show up in the right order?

The most reliable way to have your tracks appear on the page in the correct order is to name the individual files with track numbers, like this:
01_nameoffirstsong.mp3
02_nameofsecondsong.mp3
03_nameofthirdsong.mp3

(If you have more than 9 files you need to start numbering with 01 - not 1 - otherwise the files will go in this order: 1, 10, 11, 12, 2, 3 etc.)
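The ordering problem comes from names being compared as text, character by character. The sketch below illustrates this with bare track numbers and with zero-padded file names; the helper track_filename and the song titles are made up for the example, and the exact position of unpadded names in a real listing also depends on the separator character that follows the number.

```python
def track_filename(track_number: int, title: str, extension: str = "mp3") -> str:
    """Build a name like 01_nameoffirstsong.mp3 with a two-digit track number."""
    return f"{track_number:02d}_{title}.{extension}"


# Unpadded numbers compared as text: "10" sorts before "2".
unpadded = sorted(str(n) for n in range(1, 13))

# Zero-padded names sort in playing order, even when created out of order.
padded = sorted(track_filename(n, f"song{n}") for n in range(12, 0, -1))

if __name__ == "__main__":
    print(unpadded)
    print(padded)
```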

If you have already created an item and you would like to change the file names to rearrange them correctly, do the following:

  1. Follow the directions in this FAQ to check out your item and connect via FTP
  2. Rename your original files using track numbers
  3. Delete all "derived" files, leaving only your original files and the .xml files
  4. Check the item back in
  5. Click "Edit item" > "Item Manager" and then click the "derive" button

It will take a little while for the derive to finish running, but once it does you'll have all new files, in the correct order, in both the flash player and the page itself.

What kind of audio file should I submit?

The archive is all about free access to information, so you should submit file formats that are easily downloadable and/or streamable for other site patrons.

We prefer that you submit the highest quality file that you have available, and then we will attempt to create smaller file sizes and formats automatically with our deriver program. We recommend that you do not attempt to do any special encoding of your files - the more settings you mess around with, the less likely our deriver code will be able to process the file.

If you are submitting a Live Music Archive item, please only submit Flac or Shorten files. Even for non-LMA items, these are the best formats to use.

Whatever format you choose, please upload each file to your item individually (you can submit multiple files per item), in a non-compressed format. Uploading content in a .zip or .rar file makes your item unstreamable and significantly less accessible to others. If you upload .zip, .rar, non-audio formats (like .exe), or password-protected files, they may be removed by our moderators.

The table below describes what file formats we will attempt to derive depending on what type of file you submit.

The flash player is covering my files! How do I move it?

If an item has little or no description, sometimes the flash player doesn't have enough room in the top portion of the page and covers the files below. If you don't want to add a description (which would be nice, so that people know what they're listening to), you can add extra space in the description field using paragraph tags.

  • Click the "Edit item" link in the lower left box
  • Add several paragraph tags to the description field, like this:
    <p>
    <p>
    <p>
    <p>
  • Click the submit button
After 10-20 minutes, when you return to your item you should see that the files have moved down further on the page, allowing the flash player enough room at the top. Usually 4-5 <p> tags is enough.

Questions

How do I view the DJVU books?

What is the status of the Internet Bookmobile?

How do I view the PDF books?

How do I download a book in tk3 format?

What equipment does the Bookmobile use to print and bind books?

What is the directory structure for the texts?

How do you remove line breaks from the Gutenberg texts?

What is the best way to link to a book?

Can I volunteer for the book project?

Texts and Books

How do I view the DJVU books?

DJVU is an open format for scanned documents. There are free readers for Windows, Mac OS X, and Linux available at:

http://www.celartem.com/en/download/djvu.asp

Try it. We like this compact, searchable, good-looking, and open format.

What is the status of the Internet Bookmobile?

Internet Archive's Internet Bookmobile is currently out of commission.

How do I view the PDF books?

Books that are available in PDF format require Adobe Acrobat. The software is free to download and use.

How do I download a book in tk3 format?

This is a beautiful format, and well worth trying. To download a reader for Windows and Mac (pre OSX) go to http://www.nightkitchen.com/download/reader/index.phtml

What equipment does the Bookmobile use to print and bind books?

You can find a list of all the hardware and software used in the bookmobile here: http://www.archive.org/texts/bookmobile-in_it.php

You can also see a movie of a book being made here: http://www.archive.org/details/HowToMakeABookmov

What is the directory structure for the texts?

In order to store all the texts that the archive has, and will eventually acquire, the directory structure is:


IDENTIFIER/IDENTIFIER.extension (tif, djvu, pdf)

IDENTIFIER: Unique in Archive's collection, alphanumeric (URL safe), this is the original name adopted by the originating collection (alphanumeric characters and _-. Best if from 5 to 80 characters). One format is [title:8-16][vol:2][author:4][scanninglocation:0-4]
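A quick way to check a candidate identifier against the description above is sketched here. It assumes that "alphanumeric" means ASCII letters and digits, and it treats the 5 to 80 character range as a recommendation rather than a hard rule; the function names are made up for this example, and uniqueness within the Archive's collection cannot be checked locally.

```python
import re

# Assumption: ASCII letters and digits plus the three characters _ - .
IDENTIFIER_CHARS = re.compile(r"[A-Za-z0-9_.-]+")


def has_allowed_characters(identifier: str) -> bool:
    """True if the identifier uses only alphanumeric characters and _-."""
    return IDENTIFIER_CHARS.fullmatch(identifier) is not None


def has_recommended_length(identifier: str) -> bool:
    """True if the identifier is within the suggested 5 to 80 characters."""
    return 5 <= len(identifier) <= 80


if __name__ == "__main__":
    for candidate in ("HowToMakeABookmov", "my book", "abc"):
        print(candidate, has_allowed_characters(candidate), has_recommended_length(candidate))
```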

EXTENSIONS:

  • If the original files are tif files, then:
  • IDENTIFIER_orig.tif: All the original tiffs are stored in the form of a multi-page tiff. A demoware Windows viewer is Informatik Image Viewer. If it goes over 2GB, then it is stored as a tar of single-page tifs in a directory named IDENTIFIER_orig_tif/IDENTIFIER_orig_XXXX.tif, resulting in a file called IDENTIFIER_orig_tif.tar
  • IDENTIFIER.tif: All the cleaned up tifs (usually cropped, despeckled, deskewed) are stored in the form of multi page tiffs. If it goes over 2GB, then it is stored as a tar of a directory named ./IDENTIFIER_tif/IDENTIFIER_XXXX.tif resulting in a file called IDENTIFIER_tif.tar

  • If the original files are JPEG, JP2, or CR2 files, then:
  • All the original jpg files are used to make a zip file named IDENTIFIER_orig_jpg.zip where the names of the pages in the zipped directory are IDENTIFIER_orig_jpg/IDENTIFIER_orig_XXXX.jpg. If the resulting file is greater than 2GB (thus breaking the zip format until zip64 is common), then the file will be in tar format named IDENTIFIER_orig_jpg.tar . If the originals are jp2 or cr2 files, then substitute these extensions above.
  • Similarly all the processed jpg files (cropped and deskewed) are used to make a zip file named IDENTIFIER_jpg.zip where the names of the pages in the zipped directory are IDENTIFIER_jpg/IDENTIFIER_XXXX.jpg. If the resulting file is greater than 2GB (thus breaking the zip format until zip64 is common), then the file will be in tar format named IDENTIFIER_jpg.tar

  • In the case where there is a small jpg version of the files for on-screen access then a similar naming convention is used from the _orig.jpg version above, but with _200KB resulting in a file named IDENTIFIER_200KB_jpg.zip where the names of the pages in the zipped directory are IDENTIFIER_200KB_jpg/IDENTIFIER_200KB_XXXX.jpg. An equivalent version can be done with other sizes and different formats such as jp2.
  • IDENTIFIER.djvu: A nifty open scanned book format created by AT&T Labs and enhanced by LizardTech.com enabling compression and ease of reprinting. This file will also be OCR'd to make the text searchable. (/djvu/bin/documenttodjvu --filelist.txt temp.djvu, /djvu/bin --ocr aatttt.djvu)
  • IDENTIFIER_djvu.xml: This is an XML version of the OCR output which has the word positions (as a bounding box). It is used for building the djvu file and for searching the flip books, and may be used to construct a searchable pdf in the future.
  • IDENTIFIER.pdf: Adobe acrobat format that is derived from the .tif file if present.
  • IDENTIFIER.txt.tar.gz or .art.tar.gz: If there are OCR'ed text files associated with each page, these are tarred and gzipped in txt format or art which is sakhr format.
  • IDENTIFIER_cover.doc or .sxw:
    cover of the book, some in legal and some letter. doc is Microsoft Word, and sxw is OpenOffice.
  • IDENTIFIER_xxxx_bookplate.jp2 or .jpg: is the file that has a bookplate that acknowledges those behind creating the digital version. xxxx is the page that it will replace in the access formats.


  • IDENTIFIER_meta.xml: This has the catalog data (title, author, publisher, copyright information) and information about the book found while scanning (size, who scanned it) stored in a dublincore-like XML format.
  • IDENTIFIER_meta.mrc: This will be the MARC (Machine Readable Cataloging) records for the book which provides the mechanism by which computers exchange, use and interpret bibliographic information and its data elements make up the foundation of most library catalogs used today.
  • IDENTIFIER_marc.xml: marcxml format of marc record
  • IDENTIFIER_metasource.xml: where the metadata information came from (metadata about the metadata :) ).

  • LEGACY FORMATS: This could be OTIFF | PTIFF | TXT.
    • OTIFF: These are the original tiff images of the scans of the books. (to create multipage tifs we used a unix util: tiffcp OTIFF/*.tif aaattt_orig.tif)
    • PTIFF: These are processed images (cropped, deskewed, despeckled) from the original tiffs.
    • TXT: These are the text files that have been created by doing Optical Character Recognition (OCR) on the tiff images.
    * We plan to eventually remove OTIFF|PTIFF|TXT directories.
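The naming rules in the EXTENSIONS list above (multi-page tif or zip by default, tar once the result exceeds 2GB) can be sketched as a small helper. This is only an illustration: the function names are hypothetical, "2GB" is assumed to mean 2 * 1024^3 bytes, XXXX is assumed to be a zero-padded four-digit page number, and the over-2GB behaviour for size-tagged variants such as 200KB is assumed to follow the same pattern.

```python
TWO_GB = 2 * 1024 ** 3  # assumption: the FAQ's "2GB" taken as 2 GiB


def _stem(identifier, variant):
    # variant is "" for processed pages, "orig" for originals,
    # or a size tag such as "200KB" for on-screen versions.
    return f"{identifier}_{variant}" if variant else identifier


def archive_name(identifier, fmt, total_bytes, variant=""):
    """Name of the file holding all pages, following the rules listed above."""
    stem = _stem(identifier, variant)
    too_big = total_bytes > TWO_GB
    if fmt == "tif":
        # Multi-page tif normally; a tar of single-page tifs past 2GB.
        return f"{stem}_tif.tar" if too_big else f"{stem}.tif"
    # jpg, jp2, cr2: zip normally; tar past 2GB.
    return f"{stem}_{fmt}.tar" if too_big else f"{stem}_{fmt}.zip"


def page_member_name(identifier, fmt, page, variant=""):
    """Name of one page inside the zip or tar (XXXX assumed to be 4 digits)."""
    stem = _stem(identifier, variant)
    return f"{stem}_{fmt}/{stem}_{page:04d}.{fmt}"
```

For instance, archive_name("RomeoAndJuliet", "jpg", 10_000, "orig") yields RomeoAndJuliet_orig_jpg.zip, while the same call with a total size above 2GB yields RomeoAndJuliet_orig_jpg.tar.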

    How do you remove line breaks from the Gutenberg texts?

    In Word use find and replace 3 times:

    Step 1. Find two paragraph markers - ^p^p

    Replace with a neutral character ~ or # or @

    Step 2. Find one para marker - ^p

    Replace with a single space

    (This might take about 10-15 minutes on large files)

    Step 3. Put 2 para markers back in - find ~

    Replace ^p^p
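For those who prefer a script to Word, the same three-step idea can be sketched in Python. This is a minimal illustration, not an Archive tool: the function name is hypothetical, and it uses a NUL character as the neutral placeholder on the assumption that it never appears in the text. Unlike the Word steps, runs of three or more line breaks are collapsed to a single paragraph break.

```python
import re


def unwrap_paragraphs(text):
    """Join hard-wrapped lines while keeping blank-line paragraph breaks."""
    # Normalise Windows and old Mac line endings so "\n" is the only break.
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    placeholder = "\x00"  # assumed not to occur in the text
    # Step 1: protect paragraph breaks (two or more newlines).
    text = re.sub(r"\n{2,}", placeholder, text)
    # Step 2: remaining single newlines are hard wraps; make them spaces.
    text = text.replace("\n", " ")
    # Step 3: put the paragraph breaks back.
    return text.replace(placeholder, "\n\n")
```

Calling unwrap_paragraphs("a\nb\n\nc\nd") returns "a b\n\nc d".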

    What is the best way to link to a book?

    Every book in the Archive has an identifier. For example, RomeoAndJuliet. To link to the book, you should use the following URL:

    http://www.archive.org/download/RomeoAndJuliet

    Can I volunteer for the book project?

    Volunteers are welcome to come to our San Francisco location during business hours and help make books. These books are given out as calling cards and thank you gifts to help raise awareness of the Internet Archive. Please write to info at archive dot org for more information or to make an appointment.

    Questions

    A recording I uploaded and marked 'no lossy formats' had them created (mp3, ogg, m3u, etc...) . How can I remove them?

    What is the Live Music Archive all about?

    What are MD5 files?

    What are FLAC files and how can I listen to them?

    What are FFP files?

    Why are there no shows by band X?

    There's no setlist for this show - OR - The setlist does not match up with the number of files. Should I submit an error report?

    How do I burn FLAC files to CD as audio tracks?

    How do I burn SHN files to CD as audio tracks?

    What is the status of band X for the Archive?

    I'm an artist who would like to be included in the Archive, what do I need to do?

    Can I upload concert videos?

    The progress of my upload says 'File metadata XML invalid. Waiting for user to correct.' How can I fix this?

    I have more Live Music Archive questions...who do I ask?

    I have a different source for a show that is already in the archive, should I upload it anyway?

    How can I help get bands into the Live Music Archive?

    When I download concerts, I constantly get disconnected before the download completes. What can I do to fix this?

    What are the WAV MD5 files that are sometimes in filesets?

    I just uploaded a directory that contained WAV MD5 checksums, is that OK?

    My failure email is indicating that the text file failed. What can I do?

    When I try to connect to a server via FTP, I get the error 'connection timeout.' How can I fix this?

    Can bands place restrictions on material to be archived?

    I just uploaded a show and all the files fail the MD5 check, what's the deal?

    Where have all the Dave Matthews Band concerts gone? Will they be back?

    Why is there no Phish? What about Widespread Panic?

    I used to use a download manager and now it stopped working. What's the deal?

    What's the deal with magic number errors?

    Do you provide an RSS feed of new updates to the LMA?

    What does the 'Transferred by' field mean?

    Why don't I get an email when my uploads fail MD5 checksums?

    Can I log into an FTP server to download concerts?

    My in-progress upload says ' No metadata describing files found. Waiting for user to enter metadata' - what do I do?

    Can I upload live recordings that were broadcast on XM Radio or Sirius Satellite Radio?

    The Grateful Dead is here, when will we see Jerry Garcia recordings?

    Regarding removing the lossy files ... I edited my show, checked the box to remove them and clicked update. Now when I click update again, the box is still not checked. Why?

    The upload instructions require a 'FLAC Fingerprint' file with my recording - how can I create this?

    I've got a great 'filler' for the recording I am about to upload to the collection - should I include it?

    Where can I find other recordings by [trade-friendly band] that aren't in the collection?

    What are SHN files and how can I listen to them?

    I tried downloading a show and I got a '403 Forbidden' page. Why?

    How do I upload a show to the LMA?

    How do I make corrections to shows?

    What file formats are accepted for contributions to the Live Music Archive?

    I like adding concerts. Do you have a preference on the way I put in information?

    About Grateful Dead concerts on the Archive

    What are the options for streaming a full recording?

    What are the options for downloading a full recording?

    Where can I see the rest of the 'Most Downloaded Items' in the Live Music Archive?

    Where can I see the rest of the 'Top Batting Averages' of shows in the Live Music Archive?

    Live Music Archive

    A recording I uploaded and marked 'no lossy formats' had them created (mp3, ogg, m3u, etc...) . How can I remove them?

    If you come across this situation and you are the uploader, click [edit] and then 'Update'. You should see the message "Format Options Updated Successfully". Within 10 minutes the system will create a "_rules.conf" file in the recording's folder. Then, the next time the system performs an automatic sweep looking for changes, it will notice the new rules file and remove the lossy files automatically. The sweep occurs approximately twice a day, so you should see the files removed within 12-24 hours.

    If you are not the uploader, fill out an error report letting us know that the derivatives shouldn't be there and an admin will remove them when they get to the error report.

    What is the Live Music Archive all about?

    This audio archive is an online public library of live recordings available for royalty-free, no-cost public downloads. We only host material by trade-friendly artists: those who like the idea of noncommercial distribution of some or all of their live material. Live recordings are a part of our culture and might be lost in 100 years if they're not archived. We think music matters and want to preserve it for future generations.

    The LMA draws strength from the members of etree.org and other online communities of music fans devoted to providing public access to high-quality digital recordings of tradable performances. Typically, recordings are made by the fans themselves. Recordings are preserved in "Lossless" archival compression formats such as Shorten or FLAC (MP3 is not Lossless) for highest quality preservation.

    Patrons may download from the LMA with the understanding that the artists still hold their copyrights. All material is strictly noncommercial, both for access here and for any further distribution.

    What are MD5 files?

    MD5 files contain checksums, strings of characters used to uniquely represent a file. These checksums enable users to verify that music files downloaded correctly.

    A recommended tool for creating these files is MD5summer. Please note that before uploading the MD5 created with this tool you should open the MD5 in a text editor and remove the top 3 lines so the first signature is now flush with the top of the file.
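To make the idea of a checksum file concrete, here is a minimal Python sketch that computes MD5 checksums and writes them one per line. The function names are hypothetical, and the "checksum *filename" layout is the common md5sum-style format, assumed here rather than taken from the Archive's documentation. The sketch writes no header lines, so the first signature starts at the top of the file.

```python
import hashlib
from pathlib import Path


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hexadecimal MD5 checksum of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def write_md5_file(folder, pattern, md5_name):
    """Write one 'checksum *filename' line per matching file in folder."""
    folder = Path(folder)
    lines = [f"{md5_of_file(p)} *{p.name}" for p in sorted(folder.glob(pattern))]
    # Write bytes directly so line endings stay Unix-style on every platform.
    (folder / md5_name).write_bytes(("\n".join(lines) + "\n").encode("utf-8"))
    return lines
```

To verify a download, compute md5_of_file on the downloaded file and compare it with the checksum listed for that file name.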

    What are FLAC files and how can I listen to them?

    FLAC stands for free lossless audio codec. It is an open source, lossless compression algorithm for digital music. It compresses music files to 50-60% of their original size, with no loss in quality. More FLAC information can be found on the FLAC sourceforge site and in this etree FAQ.

    If you upload FLAC filesets to the LMA, please follow the naming standards to help the checking program here. Directories should be named with .flac16 or .flac24 suffix, not .flac. Otherwise, the program will report failures.
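A quick pre-upload check of the directory suffix might look like the following sketch. The function name and example directory names are hypothetical.

```python
def has_expected_flac_suffix(directory_name):
    """True if the directory name ends in .flac16 or .flac24, as requested above."""
    return directory_name.rstrip("/").endswith((".flac16", ".flac24"))
```

With this, a name such as band2004-01-01.flac16 passes, while band2004-01-01.flac does not.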

    To listen to FLAC files:

    Macintosh: Download and install MacAmp Lite, a multi-format audio player, and then install the FLAC Plugin for MacAmp.

    Windows: Download and install WinAmp, a multi-format audio player, and then install the FLAC Plugin for WinAmp. If you would like to use FLAC with your Windows Media Player (WMP) download and install the Directshow Filters for Ogg Vorbis, Speex, Theora and FLAC. This will allow WMP to not only play .flac files but .ogg files as well.

    Linux or any other UNIX-based architecture: Download and copy "libxmms-flac.so" to your XMMS media player input plugins folder.

    What are FFP files?

    FFP files contain checksums, strings of characters used to uniquely represent a FLAC file. These checksums enable users to verify which particular source a file comes from.

    Why are there no shows by band X?

    We'd like to make sure that a trade-friendly band would not mind having their shows in the Archive for public download. The best way for us to find out is by getting permission from a band representative or by the band's having an explicit policy that covers this type of site. If there are no shows by the band, either we don't have enough of this information to go forward with archiving, they have declined participation, or we are ready to accept shows but no one has uploaded anything yet. (Also, see the band status FAQ).

    Trade-unfriendly bands will not be found in the Archive, nor will otherwise trade-friendly bands who have declined to have material archived here.

    Bands, see other relevant FAQs here and here. Patrons, see more about how you can help here.

    There's no setlist for this show - OR - The setlist does not match up with the number of files. Should I submit an error report?

    There has been an increasing number of shows uploaded to the Live Music collection without setlist information, or the setlist was not properly matched to the files. When you notice a recording like this, please submit an error report only if you have an updated setlist, or you are able to match the files up correctly.

    We would prefer that you do not submit error reports letting us know that there is no setlist - tracking down setlists for every concert and matching them up to the recordings is a monumental task that has grown beyond the capabilities of the small group of Archive.org admins. We would like fans that are familiar with each artist's material to help us with this project - in your error report, please give us specific instructions on what changes to make and we will do so.

    How do I burn FLAC files to CD as audio tracks?

    You will first need to convert the FLAC files to another format that your burning program is familiar with. Windows users can use the FLAC Frontend to convert FLAC files to WAV files, which are suitable for burning programs. For Macintosh OS X users, Dan Greuel has created a tool called MacFLAC.

    How do I burn SHN files to CD as audio tracks?

    You will first need to convert the SHN files to another format that your burning program is familiar with. The following programs will convert SHN files to WAV files, which can be burned to a CD. More resources are listed in this FAQ.

    Macintosh: Download and install Doug Hornig's tool, appropriately titled, Shorten for Macintosh.

    Windows: Download and install Michael K. Weise's tool, mkwACT. Or, another good tool is Foobar2000 - make sure you get the "Special" version to have Shorten compatibility!

    Linux or any other UNIX-based architecture: Download and install shorten.

    What is the status of band X for the Archive?

    5/2006, significant site changes in progress: Formerly, you could check on the status of a band relative to the Archive on the Trade-Friendly Band Information page, which is no longer updated. This FAQ question has been updated for the new-system presentation of info. We have 3 categories:

    May be Archived- Band sections have been activated by Archive admins. Shows can be hosted here to the extent permitted by the band. Click on the band name and then through to their Policy Notes link to see what limits they may have placed on taping, trading or archiving.

    Pending- When a patron sends us information about having contacted an additional trade-friendly band, the new band is considered to be "Pending". Admins will update notes we keep on the band based on the information that people send to etree at archive dot org. (Sensitive parts of the info- such as email addresses used- will not be posted in the public notes.)

    Important: Under the new system, we cannot create a "collection page" for the band name unless and until we know that the band May Be Archived. Further, no shows may be uploaded for any band in advance of a band section's activation. Under the new system, there is no temporary "upload area" to store filesets for bands whose sections are not prepared yet. Please send shows for bands on the active list only.

    Opted Out- Some bands that may be otherwise trade-friendly may have explicitly said, "No, thanks" to our project. We respect their wishes. We still keep notes of their taping/trading policies for reference.

    If your favorite band name is not in any of these 3 categories, there are several possible reasons: They may not be trade-friendly in the first place. No one may have contacted them yet. Someone who contacted them may not have informed us yet. The band may not have written us back yet. If a band did write to us, we may not have had a chance to activate a section yet, or we may not have received enough information back from them to set up their section. In some cases, we may not have received the email successfully, so that a resend may be necessary.

    Bands, see other relevant FAQs here and here. Patrons, see more about how you can help here.

    I'm an artist who would like to be included in the Archive, what do I need to do?

    We'd love to have you! Just write to us at etree at archive dot org in English giving some kind of permission for us to archive your shows for public download and noncommercial, royalty-free circulation. It does not need to be a formally worded declaration, and can come from anyone you feel has the "say-so." We just need to be clear on how you feel about the project. We will put relevant quotes onto a new "collection" page (examples) for your performances, along with a link to your official website.

    It is necessary for you to email us at etree at archive dot org in order to create a new section. We want to be sure that the go-ahead really is coming from you. Please do not attempt to create your own collection, or to upload any of the band's shows, in advance of receiving an emailed confirmation message from curators; such attempts may significantly complicate or delay the curators' setup process.

    You can give as much or as little scope for archiving as you like. Some bands place limits on what can be hosted, and we can accommodate those. Archive Curators, volunteer fans who have proven to be in line with the spirit of this archive, will attempt to screen contributions for OK'ed material only.

    At the same time you give the go-ahead, feel free to pass along any notes or policy links on your general taping/trading stance as well. You don't need to have a formal written or posted policy before inclusion, but we'd like to know how you feel about the topic.

    Besides fans' sending their copies of your shows, you can also prepare and upload your own live recordings to the Archive, if you like. In fact, if you'd like to limit your material to selected contributions from you only, please just let us know.

    If you have any questions about the project, please ask us anytime at etree at archive dot org.

    Can I upload concert videos?

    At this time, video uploads are not being accepted, mainly because most of the bands archived prohibit the video taping of their shows. Moreover, unlike audio, where we actually have a shot at archiving the vast majority of any given band's live concerts (in very high quality format), video is scarce and, unless made by the artist (in which case, it's typically for commercial purposes), is not of particularly good quality.

    The progress of my upload says 'File metadata XML invalid. Waiting for user to correct.' How can I fix this?

    This is typically caused by illegal symbols being used somewhere in the information that was put into one of the forms submitted with the show (either the import form or "File Options"). Double check that the only characters being used are those visible on a standard English-language 104 key keyboard. More information and a few examples are here.
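One way to hunt for the offending symbols is to scan the text you entered for anything outside printable ASCII. The sketch below is only an approximation: it assumes that "characters visible on a standard English-language keyboard" means the printable ASCII range (space through tilde) plus tabs and line breaks, and the function name is hypothetical.

```python
def find_unexpected_characters(text):
    """Return (position, character) pairs that fall outside printable ASCII."""
    allowed_controls = {"\n", "\r", "\t"}
    return [
        (index, char)
        for index, char in enumerate(text)
        if not (" " <= char <= "~" or char in allowed_controls)
    ]
```

Pasting each form field into this function shows where accented letters, curly quotes, or other special symbols crept in, so they can be replaced with plain equivalents.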

    If you have trouble finding the cause, please post to the forum for help. An admin will have to resubmit the recording for another try, so please send an email including a link to the recording to etree AT archive DOT org if you believe you have cleared the issue.

    More information on what XML files are and how they are created can be read here.

    I have more Live Music Archive questions...who do I ask?

    Feel free to email etree at archive dot org with any questions, and we'll do our best to post the answers here as soon as possible. Also, the message board is a great resource; with so many kind, knowledgeable folks out there, you can often get a speedy answer to your question.

    I have a different source for a show that is already in the archive, should I upload it anyway?

    Yes! In keeping with the nature of this Archive, it is appropriate for multiple sources of the same show to be available for download. When you upload the new source, be sure to name the source in the show's top level folder to avoid confusion. Some bands do place limits on the types of sources allowed (such as soundboard recordings), so please check the policy for any given band.

    How can I help get bands into the Live Music Archive?

    If you know of a trade-friendly live-performing band that is a good candidate for the Archive, you can initiate contact. Some tips and letter templates can be found here. When you write, make it clear you are asking about the Live Music Archive at archive.org. Don't just ask about their general taping/trading stance. We want bands to know what's up.

    Next, follow up with a message to etree at archive dot org. Mention when you tried to contact the band and what contact point you used. These are important in order to update our contact records. Admins will update the contact status in an announcement forum about Pending Bands based on the message you send us.

    If you receive a reply from the band, positive or negative, send a complete copy of the email, complete with its sender's address/brief header info, to etree at archive dot org. It's a good idea to send a copy of what you asked them as well (if not quoted in the reply), since it will give context to the answer. We need to have full info in hand in order to set up the band appropriately in the Archive, and we may need to contact them for followup questions.

    If you are hesitant to make contact yourself, you can mention the band to Archive admins (send email to etree at archive dot org) and they can try a contact as time permits. To help out, supply any contact or policy info you may already know about the band.

    When I download concerts, I constantly get disconnected before the download completes. What can I do to fix this?

    If you are downloading large files from the collection with your web browser and experience trouble maintaining a reliable connection to our servers, we recommend that you use FTP (File Transfer Protocol) instead. Almost all FTP clients will allow your download to resume if the connection gets broken. In addition, many will allow you to set up a queue of files and will automatically reconnect and resume when they notice that a transfer has stopped.

    For a list of recommended free FTP clients, see this FAQ.

    What are the WAV MD5 files that are sometimes in filesets?

    MD5 checksum files are not exclusive to SHN files. An MD5 checksum can be used to ensure the accuracy of any data file (e.g. .doc, .mp3, .mpeg). Some seeders produce MD5 checksums for their WAV files, as well as for their SHN files. This is just an extra level of confirmation to ensure that exact copies of the original WAV files are being burned from the SHN files. Checking a WAV file with an MD5 checksum is no different from checking a SHN file. If you use mkwACT, you can just right-click on the WAV MD5 and choose "verify."

    I just uploaded a directory that contained WAV MD5 checksums, is that OK?

    The WAV MD5 checksums are ignored by our robot and will not cause problems for your recording.

    My failure email is indicating that the text file failed. What can I do?

    Unlike FLAC or SHN, text files do not translate identically from one platform to another. Since the archive.org servers run Unix, text files created on other operating systems will fail their MD5 check. We recommend that uploaders remove any text file entries from their MD5 files if they are having this problem.
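The effect is easy to reproduce. The sketch below assumes the difference between platforms is the line-ending convention (Windows uses carriage return plus line feed, Unix uses line feed only); the sample text and function names are made up for illustration.

```python
import hashlib


def md5_hex(data):
    """Hexadecimal MD5 checksum of a bytes object."""
    return hashlib.md5(data).hexdigest()


def to_unix_newlines(data):
    """Convert Windows (CRLF) line endings to Unix (LF) line endings."""
    return data.replace(b"\r\n", b"\n")


unix_text = b"Set 1\nSet 2\n"
windows_text = unix_text.replace(b"\n", b"\r\n")

# Same visible text, different bytes, so the checksums disagree.
print(md5_hex(unix_text) == md5_hex(windows_text))  # prints False
```

The visible text is identical in both versions, but the underlying bytes differ, which is enough to change the checksum.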

    When I try to connect to a server via FTP, I get the error 'connection timeout.' How can I fix this?

    This error is caused by a setting in your FTP client that limits the amount of time the client will wait for a server to respond. To fix this problem, increase the "server timeout" setting; 180 seconds should be enough time to connect to the archive.org servers. If you use SmartFTP, the "server timeout" setting can be found in Tools > Settings > Connections.

    Can bands place restrictions on material to be archived?

    Yes. Each band can tailor the extent of their permission to the Archive. We quote the band's wishes in the Rights section of the band's Collection page. Here are some examples of special restrictions bands have requested. We point out different cases in a band's policy information using a shorthand "Limited Flag" tag.

    We have a contribution system set up to accommodate individual bands' requirements. During the upload process, contributors are urged to double check the band's policy notes at different stages. Archive Curators, volunteer fans who have proven to be in line with the spirit of this archive, will attempt to screen contributions for OK'ed material only. In addition, access to a particular item can be removed if it becomes restricted later (for example, a date newly chosen for commercial release must be removed under some bands' policies).

    Bands, please contact us at etree at archive dot org anytime to let us know how we can work with you to make things happen.

    I just uploaded a show and all the files fail the MD5 check, what's the deal?

    Check to make sure the FTP program you used to upload the files is set to "binary" mode. If you try to upload .shn or .flac files in "ASCII" mode the files will fail the MD5 check. ASCII is the standard format for encoding plain text files (actually a subset of binary), while binary is used to encode almost all other types of files. More information on binary vs. ASCII can be found here.

    If this does not solve the problem, be sure that all the file names in the MD5 file match the .shn file names. Be aware that the UNIX system the Internet Archive runs on is case-sensitive.
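A case-sensitive comparison between the names listed in the MD5 file and the names of the uploaded files can be sketched as follows. The function names are hypothetical, and the parser assumes md5sum-style lines ("checksum filename", with an optional "*" before the name) and skips blank lines and lines starting with "#".

```python
def parse_md5_names(md5_text):
    """Extract the file names from md5sum-style lines."""
    names = []
    for line in md5_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # assumed: blank and comment lines carry no checksum
        parts = line.split(None, 1)
        if len(parts) == 2:
            names.append(parts[1].lstrip("*"))
    return names


def find_name_mismatches(md5_text, actual_names):
    """Return (listed_name, close_match_or_None) for names with no exact match."""
    actual = set(actual_names)
    by_lower = {name.lower(): name for name in actual}
    report = []
    for name in parse_md5_names(md5_text):
        if name not in actual:
            # A match that differs only in letter case is the usual culprit.
            report.append((name, by_lower.get(name.lower())))
    return report
```

An entry such as ("Track01.shn", "track01.shn") in the result means the MD5 file and the uploaded file differ only in capitalization, which is enough to fail on a case-sensitive system.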

    If you upload FLAC filesets to the LMA, please follow the naming standards to help the checking program here. Directories should be named with .flac16 or .flac24 suffix, not .flac. Otherwise, the program will report failures.

    Where have all the Dave Matthews Band concerts gone? Will they be back?

    At the request of the band's management and as a result of the band's 2003 policy change, Dave Matthews Band concerts (as well as Dave Matthews solo concerts and Dave and Tim shows) have been removed from the Internet Archive. We're very sorry about this unfortunate turn of events but feel like it is important to honor the wishes of the band and its management.

    For more information and discussion see this post:
    http://www.archive.org/iathreads/post-view.php?id=3670

    Why is there no Phish? What about Widespread Panic?

    Phish has decided not to participate in the Archive at this point in time. Their official response can be viewed here.

    Similarly, Widespread Panic has opted out of the project for the time being. They were last contacted on 11/9/2004. Their response can be seen here.

    I used to use a download manager and now it stopped working. What's the deal?

    Download managers try to increase your download speed by opening multiple connections to the server at once. Doing this does not significantly increase download speeds but dramatically hurts the performance of the server. If you wish to use a queue to download from the HTTP servers, be sure to set your download program to use only one connection at a time.

    What's the deal with magic number errors?

    If you get a magic number error when listening to or decoding a SHN file, the SHN file is most likely corrupt. First, make sure the SHN file passes MD5 verification; if it does not, redownload the file. If the file passes MD5 verification and you are still getting the magic number error, leave an error report via the show details page noting the magic number error and which track the error occurs on. Hopefully others who have downloaded the show will confirm or deny the error. If the error occurs for all downloaders, the seeder will be contacted to provide a new, uncorrupted track. Please note that there is nothing the Internet Archive administrators can do about a magic number error, because the only solution to the error is re-encoding the SHN file from the original WAV file.

    Do you provide an RSS feed of new updates to the LMA?

    Indeed! The URL of the feed is http://www.archive.org/services/collection-rss.php?mediatype=etree&collection=etree (you can plug this into a front end like AmphetaDesk, available at http://www.amphetadesk.com).
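If you would rather read the feed from a script than from a desktop reader, a minimal parser could look like the sketch below. It assumes the feed is plain RSS 2.0 with un-namespaced item, title and link elements, which has not been verified against the actual feed; the function name is hypothetical.

```python
import xml.etree.ElementTree as ET


def rss_items(rss_xml):
    """Return (title, link) pairs for every <item> in an RSS 2.0 document."""
    root = ET.fromstring(rss_xml)
    return [
        (item.findtext("title", default=""), item.findtext("link", default=""))
        for item in root.iter("item")
    ]
```

The XML text itself can be obtained with any HTTP client, for example urllib.request.urlopen(feed_url).read() from the Python standard library, and then passed to rss_items.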

    What does the 'Transferred by' field mean?

    This field indicates the person who did the original DAT/MD/Cassette to WAV conversion. Also, note that in the case of recordings made directly to laptops there is no transfer.

    Why don't I get an email when my uploads fail MD5 checksums?

    The system currently only sends emails when MD5 files are included. This means that, if you're uploading FLAC files, you still need to generate and include an MD5 file if you want to receive informational emails about the failures.

    A recommended tool for creating these files is MD5summer. Please note that before uploading the MD5 created with this tool you should open the MD5 in a text editor and remove the top 3 lines so the first signature is now flush with the top of the file.

    Can I log into an FTP server to download concerts?

    Yes, you can log into iaXX.us.archive.org (where XX is a number), with the username anonymous and use your email address as the password. Each recording's details page will have a link for FTP that will tell you which number server the show is on, and in which directory. Here is a thread with an example.
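For scripted downloads, the login described above and a resumable transfer can be sketched with Python's standard ftplib module. The function names are hypothetical; the host stays a placeholder (the details page tells you the actual server number and directory), and the 180 second timeout is the value suggested elsewhere in this FAQ.

```python
import os
from ftplib import FTP


def connect(host, email, timeout=180):
    """Log in anonymously, using an email address as the password."""
    ftp = FTP(host, timeout=timeout)
    ftp.login("anonymous", email)
    return ftp


def resume_download(ftp, remote_name, local_path):
    """Fetch remote_name in binary mode, continuing any partial local file.

    Returns the byte offset the transfer resumed from (0 for a fresh start).
    """
    offset = os.path.getsize(local_path) if os.path.exists(local_path) else 0
    with open(local_path, "ab") as handle:
        # rest= asks the server to start sending at the given byte offset.
        ftp.retrbinary(f"RETR {remote_name}", handle.write, rest=offset or None)
    return offset
```

A typical session would call connect with the server named on the recording's details page, change to the listed directory with ftp.cwd, and then call resume_download once per file. If a connection drops, running the same call again continues from where the partial file ends.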

    My in-progress upload says ' No metadata describing files found. Waiting for user to enter metadata' - what do I do?

    There are 2 XML files that get created during the import of any recording in the collection:

    showfolder_meta.xml
    showfolder_files.xml

    The first file gets created when you submit the import form to the collection. If that file does not exist, you can create it by editing the details page and clicking Update.

    The second file gets created by filling out File Options. Just click the link on the left side of the details page and fill out the form as accurately as you can.

    If either of these files is missing, your Contribution may give you this message. Please note that once the files get created, it takes 5-10 minutes before the system notices them and moves on to the next stage.

    Can I upload live recordings that were broadcast on XM Radio or Sirius Satellite Radio?

    At this point in time, Archive.org cannot host recordings that were broadcast over either of these services. Subscribers have informed us that they were required to sign a "Terms of Use" document that forbids the recording/hosting/rebroadcasting of any material received from these services. Until we hear otherwise, these recordings cannot be hosted here.

    The Grateful Dead is here, when will we see Jerry Garcia recordings?

    The taping policy of the Grateful Dead does not extend to recordings of Jerry Garcia's other lineups. Jerry's solo work is controlled by his estate. Representatives have said No to the idea of hosting shows in the Live Music Archive.

    Regarding removing the lossy files ... I edited my show, checked the box to remove them and clicked update. Now when I click update again, the box is still not checked. Why?

    It takes 2-10 minutes for your checking of that box to 'stick' ... see this discussion board post: http://www.archive.org/iathreads/post-view.php?id=22816 for an explanation of why.

    The upload instructions require a 'FLAC Fingerprint' file with my recording - how can I create this?

    In Windows:

    1. Open FLAC Frontend
    2. Drag all of the FLAC files of your recording into the FLAC Frontend window. (You can also use the "add" button to do this.)
    3. Click the "Fingerprint" button.
    4. Save the fingerprint file with a name like this: bandYYYY-MM-DD.ffp
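
    If FLAC Frontend is not available to you, the same kind of file can be approximated with a short script. The sketch below is not the tool's own implementation; it assumes that a fingerprint file is a list of 'filename:md5' lines, where the MD5 is the audio signature stored in each FLAC file's STREAMINFO header. The function names and the example file name are made up for illustration.

```python
from pathlib import Path


def flac_md5(path):
    """Read the 16-byte audio MD5 stored in a FLAC file's STREAMINFO block."""
    with open(path, "rb") as f:
        if f.read(4) != b"fLaC":
            raise ValueError(f"{path} does not start with the fLaC marker")
        header = f.read(4)
        # Low 7 bits of the first header byte are the block type; 0 is STREAMINFO.
        if len(header) < 4 or header[0] & 0x7F != 0:
            raise ValueError(f"{path} has no leading STREAMINFO block")
        length = int.from_bytes(header[1:], "big")
        streaminfo = f.read(length)
    if len(streaminfo) < 34:
        raise ValueError(f"{path} has a truncated STREAMINFO block")
    # The MD5 signature occupies the last 16 of STREAMINFO's 34 bytes.
    return streaminfo[18:34].hex()


def fingerprint_lines(folder):
    """Build one 'filename:md5' line per .flac file, sorted by name."""
    files = sorted(Path(folder).glob("*.flac"))
    return [f"{p.name}:{flac_md5(p)}" for p in files]


def write_fingerprint(folder, ffp_name):
    """Write the fingerprint file into `folder` and return its path."""
    out = Path(folder) / ffp_name
    out.write_text("\n".join(fingerprint_lines(folder)) + "\n")
    return out


if __name__ == "__main__":
    # Hypothetical name following the bandYYYY-MM-DD.ffp pattern.
    print(write_fingerprint(".", "band2006-05-01.ffp"))
```

    Compare the result against a fingerprint made with FLAC Frontend before relying on it for an upload.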

    I've got a great 'filler' for the recording I am about to upload to the collection - should I include it?

    A 'filler' is music from a different performance in addition to the main recording, typically used to fill up extra space on a CD. Sometimes the filler is a different artist, other times it is the same artist, but a different show and date.

    While this is convenient for burning full CDs, it is not appropriate to include fillers on recordings here in the collection, since they get filed under the artist and date of the main performance. Please only include the performance for the artist and date you are importing. Fillers should be filed under their own entries elsewhere in the collection.

    Where can I find other recordings by [trade-friendly band] that aren't in the collection?

    If the artist is OK with Internet trading, you may be able to find downloadable recordings through http://bt.etree.org or http://www.furthurnet.net. Also, check http://db.etree.org to find people who have copies of shows and who may be willing to trade. Etree.org has additional trading forums at http://forums.etree.org. Lastly, you can check out a band's own fan forums and mailing lists. Good luck!

    In contrast, the Live Music Archive forum at the Internet Archive is not a good place to post about trades, or to ask for shows that are not yet archived here, whether or not the band presently has a section here. Moderators may delete these posts. More posting etiquette tips for that forum are here.

    What are SHN files and how can I listen to them?

    SHN stands for Shorten, a lossless compression algorithm for digital music. It was developed by SoftSound and compresses music files to 50-60% of their original size with no loss in quality. See this FAQ.

    To listen to SHN files:

    Macintosh: Download and install MacAmp Lite, a multi-format audio player, and then install the Shorten Plugin for MacAmp.
    Windows: Download and install WinAmp, a multi-format audio player, and then install the ShnAmp Plugin for WinAmp.
    Linux or any other UNIX-based architecture: Download and install the xmms-shn plugin for the XMMS media player.

    I tried downloading a show and I got a '403 Forbidden' page. Why?

    As of May 2007, the archive conducts more refined QA/QC checks on shows that are uploaded. For more detail, see this forum post: http://www.archive.org/iathreads/post-view.php?id=124098

    When a show fails its md5 check, fails its internal FLAC checksum check, or is missing an info.txt file, every non-.xml file in the show's fileset (the FLAC files, the MP3s, etc.) becomes non-downloadable. If you click any of the music files, you will be taken to a web page titled "403 Forbidden" that says: "Forbidden. You don't have permission to access ARCHIVE.ORG_Server/show_location/file on this server." (The path shown is specific to your show file.)

    This means that the uploader has a problem with their show files. As a measure to 'stop the spread' of bad files, the system prevents downloads until the uploader contacts the archive to fix the show. If you find a show with this problem, please check back later; once the uploader has fixed it, the show will be downloadable as normal.
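
    Uploaders can catch one of these failures before uploading by running an md5 check locally. The sketch below is only an approximation of that idea, not the archive's actual QA/QC code. It assumes the show folder contains an md5sum-style checksum file with lines of the form 'hash *filename' or 'hash  filename'; the function names are made up for illustration.

```python
import hashlib
from pathlib import Path


def parse_md5_file(text):
    """Map file names to expected MD5 digests from md5sum-style text."""
    entries = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        digest, _, name = line.partition(" ")
        # Drop the separator: a second space (text mode) or '*' (binary mode).
        entries[name.lstrip(" *")] = digest.lower()
    return entries


def file_md5(path, chunk_size=1 << 20):
    """Compute the MD5 of a file without loading it all into memory."""
    h = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def check_show(folder, md5_name):
    """Return (filename, problem) pairs for files that fail the md5 check."""
    folder = Path(folder)
    expected = parse_md5_file((folder / md5_name).read_text())
    problems = []
    for name, digest in expected.items():
        target = folder / name
        if not target.is_file():
            problems.append((name, "missing"))
        elif file_md5(target) != digest:
            problems.append((name, "md5 mismatch"))
    return problems
```

    An empty result means every listed file matched its checksum. This sketch covers only the md5 check; it does not run the internal FLAC checksum check or look for the info.txt file.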

    How do I upload a show to the LMA?

    As of 5/2006, the upload method has changed significantly. Here is a walkthrough in PDF with screenshots. Another text description is here.

    Before uploading any show,